Abstract

Response-adaptive designs have been extensively studied and used 
in clinical trials. However, there is a lack of a comprehensive study of 
response-adaptive designs that include covariates, despite their impor- 
tance in clinical experiments. Because the allocation scheme and the 
estimation of parameters are affected by both the responses and the 
covariates, covariate-adjusted response-adaptive (CARA) designs are 
very complex to formulate. In this paper, we overcome the technical 
hurdles and lay out a framework for general CARA designs for the 
allocation of subjects to $K$ ($\geq 2$) treatments. The asymptotic properties
are studied under certain widely satisfied conditions. The proposed
CARA designs can be applied to generalized linear models. Two impor- 
tant special cases, the linear model and the logistic regression model, 
are considered in detail. 

1. Preliminaries. 

1.1. Brief history. In most clinical trials, patients accrue sequentially. 
Response-adaptive designs provide allocation schemes that assign different 
treatments to incoming patients based on the previous observed responses of 
patients. A major objective of response-adaptive designs in clinical trials is
to minimize the number of patients assigned to the inferior treatment,
to a degree that still permits useful statistical inference. The ethical
and other characteristics of response-adaptive designs have been extensively 
discussed by many authors (e.g., Zelen and Wei (1995)). 
Early important work on response-adaptive designs was carried out by
Thompson (1933) and Robbins (1952). Since then, a steady stream of re- 
search (Zelen (1969), Wei and Durham (1978), Eisele and Woodroofe (1995)) 
in this area has generated various treatment allocation schemes for clinical 
trials. Some of the advantages of using response-adaptive designs have been 
recently studied by Hu and Rosenberger (2003) and Rosenberger and Hu 
(2004). 

In many clinical trials, covariate information is available that has a strong 
influence on the responses of patients. For instance, the efficacy of a hyper- 
tensive drug is related to a patient's initial blood pressure and cholesterol 
level, whereas the effectiveness of a cancer treatment may depend on whether 
the patient is a smoker or a non-smoker. 

The following notation and definitions are introduced to describe the
randomized treatment allocation schemes. Consider a clinical trial with $K$
treatments. Let $X_1, X_2, \ldots$ be the sequence of random treatment
assignments. For the $m$-th subject, $X_m = (X_{m,1}, \ldots, X_{m,K})$
represents the assignment of treatment such that if the $m$-th subject is
allocated to treatment $k$, then all elements in $X_m$ are 0 except for the
$k$-th component, $X_{m,k}$, which is 1. Let $N_{m,k}$ be the number of
subjects assigned to treatment $k$ in the first $m$ assignments and write
$N_m = (N_{m,1}, \ldots, N_{m,K})$. Then $N_m = \sum_{i=1}^{m} X_i$. Let
$\{Y_{m,k},\ k = 1, \ldots, K,\ m = 1, 2, \ldots\}$ denote the responses,
where $Y_{m,k}$ is the response of the $m$-th subject to treatment $k$,
$k = 1, \ldots, K$. In practical situations, only $Y_{m,k}$ with
$X_{m,k} = 1$ is observed. Denote $Y_m = (Y_{m,1}, \ldots, Y_{m,K})$. Let
$\mathcal{X}_m = \sigma(X_1, \ldots, X_m)$ and
$\mathcal{Y}_m = \sigma(Y_1, \ldots, Y_m)$ be the corresponding sigma
fields. A response-adaptive design is defined by

\[ \psi_m = E(X_m \mid \mathcal{X}_{m-1}, \mathcal{Y}_{m-1}). \]

Now, assume that covariate information is available in the clinical study.
Let $\xi_m$ be the covariate of the $m$-th subject and
$\mathcal{Z}_m = \sigma(\xi_1, \ldots, \xi_m)$ be the corresponding sigma
field. In addition, let
$\mathcal{F}_m = \sigma(\mathcal{X}_m, \mathcal{Y}_m, \mathcal{Z}_m)$ be the
sigma field of the history. A general covariate-adjusted response-adaptive
(CARA) design is defined by

\[ \psi_m = E(X_m \mid \mathcal{F}_{m-1}, \xi_m), \]

the conditional probabilities of assigning treatments $1, \ldots, K$ to the
$m$-th patient, conditioning on the entire history including the information
of all previous $m - 1$ assignments, responses, and covariate vectors, plus
the information of the current patient's covariate vector.

A number of attempts have been made to formulate adaptive designs 
in the presence of covariates. For example, Zelen (1974) and Pocock and 
Simon (1975) considered balancing covariates by using the idea of the biased 
coin design (Efron, 1971). Atkinson (1982, 1999, 2002) tackled this problem 
by employing the D-optimality criterion with a linear model. The prime 
concern of these works is to balance allocations over the covariates with 
treatment assignment probabilities 

\[ \psi_m = E(X_m \mid \mathcal{X}_{m-1}, \mathcal{Z}_m) \]

which differs from the CARA designs. These allocation schemes do not 
depend on the outcome of the treatment which is important for adaptive 
designs that aim to reduce the number of patients that receive the inferior 
treatment. 

The history of incorporating covariates in response-adaptive designs is
short. For the randomized play-the-winner rule, Bandyopadhyay and Biswas
(1999) incorporated polytomous covariates with binary responses.
Rosenberger, Vidyashankar and Agarwal (2001) considered a CARA design for
binary responses that uses a logistic regression model. Their encouraging
simulation study indicates that their approach, together with the inclusion 
of the covariates, significantly reduces the percentage of treatment failures. 
However, theoretical justifications and asymptotic properties have not been 
given. Further, the applications of their procedure are limited to two treat- 
ments with binary responses. 

To compare two treatments, Bandyopadhyay and Biswas (2001) con- 
sidered a linear model to utilize covariate information with continuous re- 
sponses. A limiting allocation proportion was also derived in their design. 
However, according to their proposed scheme, the conditional assignment 
probabilities are 

\[ \psi_m = E(X_m \mid \mathcal{F}_{m-1}). \]

The above probabilities do not incorporate the covariates of the incoming 
patient, which in some cases are crucial. For instance, suppose that the
covariate is gender, that there are two treatments A and B, and that male
and female patients react very differently to them. Whether the
next patient is male or female should therefore be considered an important
element that affects the assignment of treatment. Recently, Atkinson
(2004) considered adaptive biased-coin designs for $K$ treatments based on a
linear regression model. Atkinson and Biswas (2005a, 2005b) proposed
adaptive biased-coin designs and Bayesian adaptive biased-coin designs for
clinical trials with normal responses. However, none of these articles
provided the asymptotic distributions of the estimators and the allocation
proportions. Without the asymptotic properties of the estimators, it is
difficult to assess the validity of the statistical inferences after using
CARA designs.

Instead of working on specific setups, we seek to derive a general
framework of CARA designs and provide a theoretical foundation for using
CARA designs. In a CARA design, the assignment of treatment $X_m$ depends on
$\mathcal{F}_{m-1}$ and the covariate information ($\xi_m$) of the incoming
patient. This
generates a certain level of technical complexity. However, it is important 
to provide a solid foundation (including asymptotic normality) for CARA 
designs that can be usefully applied in many circumstances. 

1.2. Main results and organization of the paper. The main objectives 
are (i) to propose a general CARA design that can be applied to cases in
which $K$ treatments ($K \geq 2$) are present and to different types of responses
(discrete or continuous), and (ii) to study important asymptotic properties 
of the CARA design. These properties provide a solid foundation for both 
the CARA design and the statistical inference after using a CARA design. 
Major mathematical techniques, including martingale theory and Gaussian 
approximation, are employed to develop the asymptotic results. 

The rest of the paper is organized as follows. In Section 2, we intro- 
duce the general framework of the CARA design. Useful asymptotic results 
(including the strong consistency and asymptotic normality) of both the 
estimators of the unknown parameters and the allocation proportions are 
derived. The generalized linear model represents a broad class of applica- 
tions and is an important tool in the analysis of data that involve covariates. 
In Section 3, the CARA design is applied to generalized linear models, and 
two important special cases, the linear model and the logistic regression 
model, are considered in detail. Under the general framework, we are able 
to propose many new and useful CARA designs. We then conclude our pa- 
per with some observations in Section 4. Technical proofs are provided in 
the Appendix. 
2. General CARA design.

2.1. General framework. Based on the notation in Section 1, suppose
that a patient with a covariate vector $\xi$ is assigned to treatment $k$,
$k = 1, \ldots, K$, and the observed response is $Y_k$. Assume that the
responses and the covariate vector satisfy

\[ E[Y_k \mid \xi] = p_k(\theta_k, \xi), \quad \theta_k \in \Theta_k, \quad k = 1, \ldots, K, \]

where $p_k(\cdot, \cdot)$, $k = 1, \ldots, K$, are known functions. Further,
$\theta_k$, $k = 1, \ldots, K$, are unknown parameters, and
$\Theta_k \subset \mathbb{R}^d$ is the parameter space of $\theta_k$. Write
$\theta = (\theta_1, \ldots, \theta_K)$ and
$\Theta = \Theta_1 \times \cdots \times \Theta_K$. This model is quite
general, and includes the generalized linear models of McCullagh and Nelder
(1989) as special cases. The discussion of the generalized linear models is
undertaken in Section 3. We assume that
$\{(Y_{m,1}, \ldots, Y_{m,K}, \xi_m),\ m = 1, 2, \ldots\}$ is a sequence of
i.i.d. random vectors, the distributions of which are the same as that of
$(Y_1, \ldots, Y_K, \xi)$.

2.2. CARA design. The allocation scheme is as follows. To start,
assign $m_0$ subjects to each treatment by using a restricted
randomization. Assume that $m$ ($m \geq K m_0$) subjects have been assigned
to treatments. Their responses $\{Y_j,\ j = 1, \ldots, m\}$ and the
corresponding covariates $\{\xi_j,\ j = 1, \ldots, m\}$ are observed. We let
$\hat\theta_m = (\hat\theta_{m,1}, \ldots, \hat\theta_{m,K})$ be an
estimate of $\theta = (\theta_1, \ldots, \theta_K)$. Here, for each
$k = 1, \ldots, K$,
$\hat\theta_{m,k} = \hat\theta_{m,k}(Y_{j,k}, \xi_j : X_{j,k} = 1,\ j = 1, \ldots, m)$
is the estimator of $\theta_k$ that is based on the observed
$N_{m,k}$-size sample
$\{(Y_{j,k}, \xi_j) : X_{j,k} = 1,\ j = 1, \ldots, m\}$. Next, when
the $(m+1)$-th subject is ready for randomization and the corresponding
covariate $\xi_{m+1}$ is recorded, we assign the patient to treatment $k$
with a probability of

\[ \psi_k = P(X_{m+1,k} = 1 \mid \mathcal{F}_m, \xi_{m+1}) = \pi_k(\hat\theta_m, \xi_{m+1}), \quad k = 1, \ldots, K, \tag{2.1} \]

where $\mathcal{F}_m = \sigma(X_1, \ldots, X_m, Y_1, \ldots, Y_m, \xi_1, \ldots, \xi_m)$
is the sigma field of the history and $\pi_k(\cdot, \cdot)$,
$k = 1, \ldots, K$, are some given functions. Given $\mathcal{F}_m$ and
$\xi_{m+1}$, the response $Y_{m+1}$ of the $(m+1)$-th subject is assumed
to be independent of its assignment $X_{m+1}$. We call the function
$\pi(\cdot, \cdot) = (\pi_1(\cdot, \cdot), \ldots, \pi_K(\cdot, \cdot))$ the
allocation function; it satisfies $\pi_1 + \cdots + \pi_K = 1$.
Let $g_k(\theta^*) = E[\pi_k(\theta^*, \xi)]$. From (2.1), it follows that

\[ P(X_{m+1,k} = 1 \mid \mathcal{F}_m) = g_k(\hat\theta_m), \quad k = 1, \ldots, K. \tag{2.2} \]
Different choices of $\pi(\cdot, \cdot)$ generate different possible classes
of useful designs. For example, we can take
$\pi_k(\theta, \xi) = R_k(\theta_1 \xi^T, \ldots, \theta_K \xi^T)$,
$k = 1, \ldots, K$, which includes a large class of applications. Here,
$0 < R_k(z) < 1$, $k = 1, \ldots, K$, are real functions that are defined in
$\mathbb{R}^K$ with

\[ \sum_{k=1}^{K} R_k(z) = 1 \quad \text{and} \quad R_i(z) = R_j(z) \ \text{whenever } z_i = z_j. \tag{2.3} \]

For simplicity, it is assumed that $\xi$ and $\theta_k$, $k = 1, \ldots, K$,
have the same dimensions; otherwise, slight modifications are necessary (see
Example 3.1 for an illustration). In practice, the functions $R_k$ can be
defined as

\[ R_k(z) = \frac{G(z_k)}{G(z_1) + \cdots + G(z_K)}, \quad k = 1, \ldots, K, \]

where $G$ is a smooth real function that is defined in $\mathbb{R}$ and
satisfies $0 < G(z) < \infty$. An example is
$R_k(z) = e^{T z_k} / (e^{T z_1} + \cdots + e^{T z_K})$, $k = 1, \ldots, K$.
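As a minimal sketch of such allocation functions, the Python below assumes the normalized form $R_k(z) = G(z_k)/\sum_j G(z_j)$, which is consistent with the exponential example. The function names `allocation_from_G` and `softmax_allocation` and the default `T=1.0` are illustrative assumptions, not part of the paper.

```python
import numpy as np

def allocation_from_G(z, G):
    """Return (R_1(z), ..., R_K(z)) with R_k(z) = G(z_k) / sum_j G(z_j)."""
    g = np.array([G(zk) for zk in z], dtype=float)
    if np.any(g <= 0) or not np.all(np.isfinite(g)):
        raise ValueError("G must satisfy 0 < G(z) < infinity")
    return g / g.sum()

def softmax_allocation(z, T=1.0):
    """Special case G(z) = exp(T z), evaluated in a numerically stable way."""
    z = np.asarray(z, dtype=float)
    # Subtracting max(z) leaves the ratios unchanged; for T >= 0 it keeps
    # every exponent <= 0 and so avoids overflow.
    w = np.exp(T * (z - z.max()))
    return w / w.sum()
```

Both functions return probabilities that sum to one and assign equal probability to treatments with equal $z_k$, as required by (2.3).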

In the two-treatment case, we can let $R_1(z_1, z_2) = G(z_1 - z_2)$ and
$R_2(z_1, z_2) = G(z_2 - z_1)$, where $G$ is a real function defined on
$\mathbb{R}$ satisfying $G(0) = 1/2$, $G(-z) = 1 - G(z)$ and
$0 < G(z) < 1$ for all $z$. For the logistic
regression model, Rosenberger, Vidyashankar and Agarwal (2001) suggested
using the estimated covariate-adjusted odds ratio to allocate subjects, which
is equivalent to defining $R_k(z_1, z_2) = e^{z_k}/(e^{z_1} + e^{z_2})$,
$k = 1, 2$. For each fixed covariate $\xi$, we can also choose
$\pi(\theta, \xi)$ according to Baldi Antognini and
Giovagnoli (2004) and Hu and Rosenberger (2003). When $\pi(\theta, \xi)$ does
not depend on $\xi$, one can use the allocation scheme of Bandyopadhyay and
Biswas (2001) for the normal linear regression model. We now introduce
some important asymptotic properties.
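The allocation scheme of this subsection can be sketched in code. The Python below is a simulation under stated assumptions, not the authors' implementation: responses are binary and follow a logistic model, the covariate vector is $(1, x)$ with a binary $x$, the allocation function is $\pi_k = e^{z_k}/\sum_j e^{z_j}$ with $z_k = \hat\theta_k \xi^T$, and the estimator is a lightly ridge-penalized Newton-Raphson fit whose output is clipped to a bounded box. The names `fit_logistic` and `simulate_cara` and the parameters `ridge`, `bound`, `m0` and `seed` are hypothetical.

```python
import numpy as np

def fit_logistic(xi, y, n_iter=25, ridge=1e-2, bound=10.0):
    """Newton-Raphson fit of logit(p) = xi @ theta.

    The small ridge term and the clipping to [-bound, bound] are stabilising
    assumptions of this sketch (the clipping mimics a bounded parameter space).
    """
    d = xi.shape[1]
    theta = np.zeros(d)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(xi @ theta, -30.0, 30.0)))
        grad = xi.T @ (y - p) - ridge * theta
        hess = (xi * (p * (1.0 - p))[:, None]).T @ xi + ridge * np.eye(d)
        theta = np.clip(theta + np.linalg.solve(hess, grad), -bound, bound)
    return theta

def simulate_cara(n, theta_true, m0=10, seed=0):
    """Simulate n patients; after the start-up phase, allocate as in (2.1)."""
    rng = np.random.default_rng(seed)
    K, d = theta_true.shape  # this sketch assumes d == 2
    # Covariate vector (1, x) with binary x; the leading 1 carries the intercept.
    xi = np.column_stack([np.ones(n), rng.integers(0, 2, size=n)])
    # Start-up: m0 subjects per treatment in random order (restricted randomization).
    start = rng.permutation(np.repeat(np.arange(K), m0))
    arm = np.empty(n, dtype=int)
    y = np.empty(n)
    theta_hat = np.zeros((K, d))
    for m in range(n):
        if m < K * m0:
            arm[m] = start[m]
        else:
            # Re-estimate theta_k from the subjects already assigned to treatment k.
            for k in range(K):
                sel = arm[:m] == k
                theta_hat[k] = fit_logistic(xi[:m][sel], y[:m][sel])
            z = theta_hat @ xi[m]                  # z_k = theta_hat_k xi^T
            w = np.exp(z - z.max())
            arm[m] = rng.choice(K, p=w / w.sum())  # allocation probabilities
        p_success = 1.0 / (1.0 + np.exp(-theta_true[arm[m]] @ xi[m]))
        y[m] = float(rng.random() < p_success)
    return arm, y, xi, theta_hat
```

For example, `simulate_cara(200, np.array([[0.0, 1.5], [0.0, -1.5]]))` makes treatment 0 the better choice when $x = 1$ and treatment 1 the better choice when $x = 0$, so the allocation should tend to tilt accordingly within each covariate level. Refitting at every step is quadratic in $n$ and is used here only for clarity.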

2.3. Asymptotic properties. Write
$\pi(\theta^*, x) = (\pi_1(\theta^*, x), \ldots, \pi_K(\theta^*, x))$,
$g(\theta^*) = (g_1(\theta^*), \ldots, g_K(\theta^*))$,
$v_k = g_k(\theta) = E[\pi_k(\theta, \xi)]$, $k = 1, \ldots, K$, and
$v = (v_1, \ldots, v_K)$. We assume that $0 < v_k < 1$, $k = 1, \ldots, K$.
For the allocation function $\pi(\theta^*, x)$ we assume the following
condition.

Condition A We assume that the parameter space $\Theta_k$ is a bounded domain
in $\mathbb{R}^d$, and that the true value $\theta_k$ is an interior point of
$\Theta_k$, $k = 1, \ldots, K$.

1. For each fixed $x$, $\pi_k(\theta^*, x) > 0$ is a continuous function of
$\theta^*$, $k = 1, \ldots, K$.

2. For each $k = 1, \ldots, K$, $\pi_k(\theta^*, \xi)$ is differentiable with
respect to $\theta^*$ under the expectation, and there is a $\delta > 0$ such
that

\[ g_k(\theta^*) = g_k(\theta) + (\theta^* - \theta) \left( \frac{\partial g_k}{\partial \theta^*} \Big|_{\theta} \right)^T + o(\|\theta^* - \theta\|^{1+\delta}), \]

where $\partial g_k / \partial \theta^* = (\partial g_k / \partial \theta^*_{11}, \ldots, \partial g_k / \partial \theta^*_{Kd})$.
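Theorem 2.1 Suppose that for $k = 1, \ldots, K$,

\[ \hat\theta_{n,k} - \theta_k = \frac{1}{n} \sum_{m=1}^{n} X_{m,k} h_k(Y_{m,k}, \xi_m)(1 + o(1)) + o(n^{-1/2}) \quad \text{a.s.}, \tag{2.4} \]

where $h_k$ are $K$ functions with $E[h_k(Y_k, \xi) \mid \xi] = 0$. We also
assume that $E\|h_k(Y_k, \xi)\|^2 < \infty$, $k = 1, \ldots, K$. Then under
Condition A we have, for $k = 1, \ldots, K$,

\[ P(X_{n,k} = 1) \to v_k; \quad P(X_{n,k} = 1 \mid \mathcal{F}_{n-1}, \xi_n = x) \to \pi_k(\theta, x) \quad \text{a.s.} \tag{2.5} \]

and

\[ \frac{N_n}{n} - v = O\left( \sqrt{\frac{\log\log n}{n}} \right) \ \text{a.s.}; \quad \hat\theta_n - \theta = O\left( \sqrt{\frac{\log\log n}{n}} \right) \ \text{a.s.} \tag{2.6} \]

Further, let
$V_k = E\{\pi_k(\theta, \xi)(h_k(Y_k, \xi))^T h_k(Y_k, \xi)\}$,
$k = 1, \ldots, K$, $V = \mathrm{diag}(V_1, \ldots, V_K)$,
$\Sigma_1 = \mathrm{diag}(v) - v^T v$,
$\Sigma_2 = \sum_{k=1}^{K} \frac{\partial g}{\partial \theta_k} V_k \left( \frac{\partial g}{\partial \theta_k} \right)^T$
and $\Sigma = \Sigma_1 + 2\Sigma_2$. Then,

\[ \sqrt{n}(N_n/n - v) \xrightarrow{D} N(0, \Sigma) \quad \text{and} \quad \sqrt{n}(\hat\theta_n - \theta) \xrightarrow{D} N(0, V). \tag{2.7} \]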

Remark 2.1 Whether Condition (2.4) holds depends on the estimation method
used. In the next section, we show that it is satisfied in many cases.
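To make the structure of the limiting covariance in Theorem 2.1 concrete, the following Python sketch assembles $\Sigma = \Sigma_1 + 2\Sigma_2$ from its ingredients. It assumes that each $\partial g / \partial \theta_k$ is stored as a $K \times d$ array (row $i$ holding the derivative of $g_i$) and each $V_k$ as a $d \times d$ array; the function name `allocation_covariance` and the input layout are illustrative choices.

```python
import numpy as np

def allocation_covariance(v, grads, V_blocks):
    """Sigma = Sigma_1 + 2 * Sigma_2 for the allocation proportions.

    v        : length-K vector of limiting proportions v_k
    grads    : list of K arrays, grads[k] = d g / d theta_k with shape (K, d)
    V_blocks : list of K arrays, V_blocks[k] = V_k with shape (d, d)
    """
    v = np.asarray(v, dtype=float)
    sigma1 = np.diag(v) - np.outer(v, v)                           # diag(v) - v^T v
    sigma2 = sum(G @ Vk @ G.T for G, Vk in zip(grads, V_blocks))   # sum_k G_k V_k G_k^T
    return sigma1 + 2.0 * sigma2
```

Because $g_1 + \cdots + g_K = 1$, the columns of each gradient array sum to zero, so every row of the resulting $\Sigma$ sums to zero as well. This gives a quick sanity check on the inputs.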

Theorem 2.1 provides general results on the asymptotic properties of
the allocation proportions $N_{n,k}/n$, $k = 1, \ldots, K$. Sometimes, one
may be interested in the proportions for a given covariate (for discrete
$\xi$) as discussed in Section 3. Given a covariate $x$, the proportion of
subjects that is assigned to treatment $k$ is

\[ \frac{\sum_{m=1}^{n} X_{m,k} I\{\xi_m = x\}}{\sum_{m=1}^{n} I\{\xi_m = x\}} := \frac{N_{n,k|x}}{N_n(x)}, \]

where $N_{n,k|x}$ is the number of subjects with covariate $x$ that is
randomized to treatment $k$, $k = 1, \ldots, K$, in the $n$ trials, and
$N_n(x)$ is the total number of subjects with covariate $x$. Write
$N_{n|x} = (N_{n,1|x}, \ldots, N_{n,K|x})$. The following
theorem establishes the asymptotic results of these proportions.
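The conditional proportions defined above are straightforward to compute from trial data. The sketch below assumes a scalar discrete covariate stored as an array and 0-based treatment labels; the function name `conditional_proportions` is hypothetical.

```python
import numpy as np

def conditional_proportions(arm, xi, x, K):
    """Return (N_{n,1|x}/N_n(x), ..., N_{n,K|x}/N_n(x)) and N_n(x)."""
    arm = np.asarray(arm)
    xi = np.asarray(xi)
    mask = xi == x                      # indicator I{xi_m = x}
    n_x = int(mask.sum())               # N_n(x)
    if n_x == 0:
        raise ValueError("no subject has covariate value x")
    counts = np.bincount(arm[mask], minlength=K)   # N_{n,k|x} for each k
    return counts / n_x, n_x
```

For instance, `conditional_proportions([0, 1, 1, 0, 1], [1, 1, 0, 1, 0], x=1, K=2)` restricts attention to the three subjects with covariate value 1, two of whom received treatment 0.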
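Theorem 2.2 Given a covariate $x$, suppose that $P(\xi = x) > 0$. Under
Condition A and (2.4), we have

\[ N_{n,k|x} / N_n(x) \to \pi_k(\theta, x) \quad \text{a.s.}, \quad k = 1, \ldots, K, \tag{2.8} \]

and

\[ \sqrt{N_n(x)} \left( N_{n|x} / N_n(x) - \pi(\theta, x) \right) \xrightarrow{D} N(0, \Sigma_{|x}), \tag{2.9} \]

where

\[ \Sigma_{|x} = \mathrm{diag}(\pi(\theta, x)) - \pi(\theta, x)^T \pi(\theta, x) + 2 \sum_{k=1}^{K} \frac{\partial \pi(\theta, x)}{\partial \theta_k} V_k \left( \frac{\partial \pi(\theta, x)}{\partial \theta_k} \right)^T P(\xi = x). \]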

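3. Generalized linear models.

In this section, the general results of Section 2 are applied to the
generalized linear model (GLM) and its two special cases, the logistic
regression model and the linear model (refer to McCullagh and Nelder (1989)
for applications of these models). Suppose, given $\xi$, that the response
$Y_k$ of a trial of treatment $k$ has a distribution in the exponential
family, and takes the form

\[ f_k(y_k \mid \xi, \theta_k) = \exp\{ (y_k \mu_k - a_k(\mu_k)) / \phi_k + b_k(y_k, \phi_k) \} \tag{3.1} \]

with link function $\mu_k = h_k(\xi \theta_k^T)$, where
$\theta_k = (\theta_{k1}, \ldots, \theta_{kd})$, $k = 1, \ldots, K$,
are coefficients. Assume that the scale parameter $\phi_k$ is fixed. Then
$E[Y_k \mid \xi] = a_k'(\mu_k)$,
$\mathrm{Var}(Y_k \mid \xi) = a_k''(\mu_k) \phi_k$ and

\[ \frac{\partial \log f_k(y_k \mid \xi, \theta_k)}{\partial \theta_k} = \frac{1}{\phi_k} \{ y_k - a_k'(\mu_k) \} h_k'(\xi \theta_k^T) \xi, \]

\[ \frac{\partial^2 \log f_k(y_k \mid \xi, \theta_k)}{\partial \theta_k^2} = \frac{1}{\phi_k} \left\{ -a_k''(\mu_k) [h_k'(\xi \theta_k^T)]^2 + [y_k - a_k'(\mu_k)] h_k''(\xi \theta_k^T) \right\} \xi^T \xi. \]

Thus, given $\xi$, the conditional Fisher information matrix is

\[ I_k(\theta_k \mid \xi) = -E\left[ \frac{\partial^2 \log f_k(Y_k \mid \xi, \theta_k)}{\partial \theta_k^2} \,\Big|\, \xi \right] = \frac{1}{\phi_k} a_k''(\mu_k) [h_k'(\xi \theta_k^T)]^2 \xi^T \xi. \]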

For the observations up to stage $m$, the likelihood function is

\[ L(\theta) = \prod_{j=1}^{m} \prod_{k=1}^{K} [f_k(Y_{j,k} \mid \xi_j, \theta_k)]^{X_{j,k}} = \prod_{k=1}^{K} \prod_{j=1}^{m} [f_k(Y_{j,k} \mid \xi_j, \theta_k)]^{X_{j,k}} := \prod_{k=1}^{K} L_k(\theta_k) \]

with
$\log L_k(\theta_k) \propto \sum_{j=1}^{m} X_{j,k} \{ Y_{j,k} \mu_{j,k} - a_k(\mu_{j,k}) \}$,
$\mu_{j,k} = h_k(\xi_j \theta_k^T)$, $k = 1, 2, \ldots, K$. The MLE
$\hat\theta_m = (\hat\theta_{m,1}, \ldots, \hat\theta_{m,K})$ of
$\theta = (\theta_1, \ldots, \theta_K)$ is that for which $\hat\theta_m$
maximizes $L(\theta)$ over
$\theta \in \Theta_1 \times \cdots \times \Theta_K$. Equivalently,
$\hat\theta_{m,k}$ maximizes $L_k$ over $\theta_k \in \Theta_k$,
$k = 1, 2, \ldots, K$. Rosenberger, Flournoy and
Durham (1997) established a general result for the asymptotic normality of
MLEs from a response-driven design. Rosenberger and Hu (2002) gave the
asymptotic normality of the regression parameters from a generalized linear
model that followed a sequential design with covariate vectors. These two
papers neither examined the case of using covariates to adjust the design,
nor established the asymptotic properties of the allocation proportions. The
next corollary gives results on both the estimators of the parameters and
the allocation proportions.

Corollary 3.1 Define

\[ I_k = I_k(\theta) = E\{ \pi_k(\theta, \xi) I_k(\theta_k \mid \xi) \}, \quad k = 1, \ldots, K. \tag{3.2} \]

Under Condition A, if the matrices $I_k$, $k = 1, 2, \ldots, K$, are
nonsingular and the MLE $\hat\theta_m$ is unique, then under regularity
condition (A.13) in the Appendix, we have (2.5), (2.6) and (2.7) with
$V_k = I_k^{-1}$, $k = 1, \ldots, K$. Moreover, if $P(\xi = x) > 0$ for a
given covariate $x$, then (2.8) and (2.9) hold.

This result is a corollary of Theorems 2.1 and 2.2. The proof is given in the
Appendix through the verification of Condition (2.4). For both the logistic
regression and the linear regression, condition (A.13) is satisfied.

Remark 3.1 From Theorem 2.1 it follows that

\[ \sqrt{N_{n,k}} (\hat\theta_{n,k} - \theta_k) \xrightarrow{D} N\left( 0, v_k \{ E[\pi_k(\theta, \xi) I_k(\theta_k \mid \xi)] \}^{-1} \right), \quad k = 1, \ldots, K. \tag{3.3} \]

It should be noted that the asymptotic variances are different from those of
general linear models with a fixed allocation procedure. For the latter, we
have

\[ \sqrt{N_{n,k}} (\hat\theta_{n,k} - \theta_k) \xrightarrow{D} N\left( 0, \{ E[I_k(\theta_k \mid \xi)] \}^{-1} \right), \quad k = 1, \ldots, K. \tag{3.4} \]

If the allocation functions $\pi_k(\theta, \xi)$ do not depend on $\xi$, then
$\pi_k(\theta, \xi) = g_k(\theta) = v_k$, and so (3.3) and (3.4) are
identical. Our asymptotic variance-covariance matrix of $\hat\theta_n$ is
also different from that in Theorem 2 of Baldi
Antognini and Giovagnoli (2004), because the allocation probabilities in their
study do not depend on the covariates.



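Remark 3.2 When the distribution of $\xi$ and the true value of $\theta$ are
known, the values of $v = E[\pi(\theta, \xi)]$,
$\partial g / \partial \theta_k = E[\partial \pi(\theta, \xi) / \partial \theta_k]$
and $I_k$ in (3.2) can be obtained by computing the expectations, and then the
values of the asymptotic variance-covariance matrices $V$, $\Sigma$ and
$\Sigma_{|x}$ can be obtained. In practice, we can obtain the estimates as
follows.

(a) Estimate $I_k$ by
$\hat{I}_{n,k} = \frac{1}{n} \sum_{m=1}^{n} X_{m,k} I_k(\hat\theta_{n,k} \mid \xi_m)$,
$k = 1, 2, \ldots, K$; the estimator of $V$ is then
$\hat{V}_n = \mathrm{diag}(\hat{I}_{n,1}^{-1}, \ldots, \hat{I}_{n,K}^{-1})$.

(b) Estimate $\Sigma_1$ and $\partial g / \partial \theta_k$, respectively, by

\[ \hat\Sigma_1 = \mathrm{diag}\left( \frac{N_n}{n} \right) - \left( \frac{N_n}{n} \right)^T \frac{N_n}{n} \quad \text{and} \quad \widehat{\frac{\partial g}{\partial \theta_k}} = \frac{1}{n} \sum_{m=1}^{n} \frac{\partial \pi(\theta^*, \xi_m)}{\partial \theta_k^*} \Big|_{\theta^* = \hat\theta_n}. \]

(c) Define the estimator of $\Sigma$ by
$\hat\Sigma = \hat\Sigma_1 + 2 \sum_{k=1}^{K} \widehat{\frac{\partial g}{\partial \theta_k}} \hat{I}_{n,k}^{-1} \left( \widehat{\frac{\partial g}{\partial \theta_k}} \right)^T$.

(d) For a given covariate $x$, we can estimate $\Sigma_{|x}$ by

\[ \hat\Sigma_{|x} = \mathrm{diag}(\pi(\hat\theta_n, x)) - \pi(\hat\theta_n, x)^T \pi(\hat\theta_n, x) + 2 \sum_{k=1}^{K} \frac{\partial \pi(\theta^*, x)}{\partial \theta_k^*} \Big|_{\theta^* = \hat\theta_n} \hat{I}_{n,k}^{-1} \left( \frac{\partial \pi(\theta^*, x)}{\partial \theta_k^*} \Big|_{\theta^* = \hat\theta_n} \right)^T \frac{\#\{m \leq n : \xi_m = x\}}{n}. \]

Notice that $\phi_k I_k(\theta \mid \xi)$ does not depend on $\phi_k$. When
$\phi_k$ is unknown, we can estimate $I_k$ in the same way after replacing
$\phi_k$ with its estimate $\hat\phi_k$.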
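Step (b) can be sketched for one concrete allocation function. The Python below assumes the softmax allocation $\pi_i(\theta, \xi) = e^{\theta_i \xi^T} / \sum_j e^{\theta_j \xi^T}$, for which $\partial \pi_i / \partial \theta_k = \pi_i (\delta_{ik} - \pi_k) \xi$; this closed form is a property of the assumed allocation function, not a formula stated in the text. The names `softmax_rows`, `sigma1_hat` and `averaged_gradients` are hypothetical.

```python
import numpy as np

def softmax_rows(Z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    W = np.exp(Z - Z.max(axis=1, keepdims=True))
    return W / W.sum(axis=1, keepdims=True)

def sigma1_hat(N_n):
    """Plug-in estimate diag(N_n/n) - (N_n/n)^T (N_n/n)."""
    p = np.asarray(N_n, dtype=float) / np.sum(N_n)
    return np.diag(p) - np.outer(p, p)

def averaged_gradients(theta_hat, xi):
    """grads[k][i, :] = (1/n) sum_m d pi_i(theta, xi_m) / d theta_k at theta_hat."""
    n, K = xi.shape[0], theta_hat.shape[0]
    P = softmax_rows(xi @ theta_hat.T)               # (n, K): pi_i(theta_hat, xi_m)
    grads = []
    for k in range(K):
        delta = (np.arange(K) == k).astype(float)    # Kronecker delta_{ik}
        coef = P * (delta[None, :] - P[:, [k]])      # pi_i (delta_ik - pi_k)
        grads.append(coef.T @ xi / n)                # (K, d)
    return grads
```

Each returned array has shape $K \times d$, so it can be passed directly into the quadratic form in step (c). For an allocation function without a convenient derivative, a central finite difference of the averaged allocation probabilities would be a reasonable substitute.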

We now consider two examples, the logistic regression model and the 
linear model. 

Example 3.1. Logistic Regression Model. We consider the case of
dichotomous (i.e., success or failure) responses. Let $Y_k = 1$ if a subject
being given treatment $k$ is a success and 0 otherwise, $k = 1, \ldots, K$.
Let $p_k = p_k(\theta_k, \xi) = P(Y_k = 1 \mid \xi)$ be the probability of
the success of a trial of treatment $k$ for a given covariate $\xi$,
$q_k = 1 - p_k$, $k = 1, \ldots, K$. Assume that

\[ \mathrm{logit}(p_k) = \alpha_k + \theta_k \xi^T, \quad k = 1, \ldots, K. \tag{3.5} \]

Without loss of generality, we assume that $\alpha_k = 0$,
$k = 1, 2, \ldots, K$, or alternatively, we can redefine the covariate vector
to be $(1, \xi)$. For each $k = 1, \ldots, K$, let
$p_{j,k} = p_k(\theta_k, \xi_j)$. With the observations up to stage $m$,
the MLE $\hat\theta_{m,k}$ of $\theta_k$ ($k = 1, \ldots, K$) is that for
which $\hat\theta_{m,k}$ maximizes

\[ L_k := \prod_{j=1}^{m} p_{j,k}^{X_{j,k} Y_{j,k}} (1 - p_{j,k})^{X_{j,k}(1 - Y_{j,k})} \quad \text{over } \theta_k \in \Theta_k. \tag{3.6} \]

The logistic regression model is a special case of GLM (3.1) with
$\phi_k = 1$, $\mu_k = \log(p_k / q_k)$, $h_k(x) = x$,
$b_k(y_k, \phi_k) = 0$, and
$a_k(\mu_k) = -\log(1 - p_k) = \log(1 + e^{\mu_k})$. Thus, given $\xi$, the
conditional information matrix is
$I_k(\theta_k \mid \xi) = a_k''(\mu_k) \xi^T \xi = p_k q_k \xi^T \xi$. From
Corollary 3.1, we have the following corollary.

Corollary 3.2 Suppose that Condition A is satisfied, $E\|\xi\|^2 < \infty$,
and the matrix $E[\xi^T \xi]$ is nonsingular. We then have (2.5), (2.6),
(2.7) with $V_k = I_k^{-1}$ and
$I_k = E\{\pi_k(\theta, \xi) p_k q_k \xi^T \xi\}$, $k = 1, \ldots, K$.
Moreover, if $P(\xi = x) > 0$ for a given covariate $x$, then (2.8) and (2.9)
hold.
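For the logistic model, $a_k(\mu) = \log(1 + e^{\mu})$ gives $a_k'(\mu) = p_k$ and $a_k''(\mu) = p_k q_k$, which is where the weight $p_k q_k$ in $I_k$ comes from. A plug-in estimate of $I_k$ in the spirit of Remark 3.2(a) can be sketched as follows; the function name `information_estimate` and the array layout (rows of `xi` are covariate vectors, `arm` holds 0-based treatment labels) are assumptions of this sketch.

```python
import numpy as np

def information_estimate(xi, arm, theta_hat_k, k):
    """(1/n) sum_m X_{m,k} p q xi_m^T xi_m, evaluated at theta_hat_k."""
    xi = np.asarray(xi, dtype=float)
    arm = np.asarray(arm)
    n = xi.shape[0]
    p = 1.0 / (1.0 + np.exp(-xi @ theta_hat_k))   # fitted success probabilities
    w = (arm == k) * p * (1.0 - p)                # X_{m,k} * p * q
    return (xi * w[:, None]).T @ xi / n
```

Inverting the returned matrix gives the corresponding block of the estimated covariance $\hat{V}_n$, provided the matrix is nonsingular.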

Example 3.2. Normal Linear Regression Model. The responses are normally
distributed, that is, $Y_k \mid \xi \sim N(\mu_k, \sigma_k^2)$ with link
function $\mu_k = \theta_k \xi^T$. The linear model is then a special case of
GLM (3.1) with $\phi_k = \sigma_k^2$, $a_k(\mu_k) = \mu_k^2 / 2$ and
$h_k(x) = x$. Thus, we have the following corollary.

Corollary 3.3 Suppose that the conditions in Corollary 3.2 are satisfied.
We then have (2.5), (2.6), (2.7) with $V_k = I_k^{-1}$ and
$I_k = E\{\pi_k(\theta, \xi) \xi^T \xi\} / \sigma_k^2$,
$k = 1, \ldots, K$. Moreover, if $P(\xi = x) > 0$ for a given $x$, then (2.8)
and (2.9) hold.

Remark 3.3 Bandyopadhyay and Biswas (2001) considered the normal linear
regression model in which $\theta_{11} = \mu_1$, $\theta_{21} = \mu_2$,
$\theta_{1j} = \theta_{2j} = \beta_{j-1}$,
$j = 2, \ldots, d$, and the first component of $\xi$ is 1. Their proposed
allocation probabilities are functions of estimates of the unknown parameters
that depend only on information of the previous patients, but not on the
covariates of the incoming patient. Theorem 1 of Bandyopadhyay and Biswas
(2001) gives the consistency property of $N_{n,1}/n$ and $P(X_{n,1} = 1)$.
However, their proof is not correct, since the assignments
$\delta_1, \ldots, \delta_i$ are functions of the
previous responses $Y_1, \ldots, Y_{i-1}$ and covariates. In fact, given the
assignments $\delta_1, \ldots, \delta_i$, the responses $Y_1, \ldots, Y_i$
are no longer independent normal
variables, which implies that their equation (4) is not valid. Nevertheless, if
we let

£ = (1,0, a = El 1^ = Var{£}, Vl = <*>(^^) and v 2 = 1 - v u 

under our theoretical framework, it can be proved that Theorem 1 of Bandy- 
opadhyay and Biswas (2001) is correct. Further, it is not difficult to show 
that 



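\[ \sqrt{n} (\hat\mu_{n1} - \mu_1, \hat\mu_{n2} - \mu_2) \xrightarrow{D} N\left( (0, 0),\ \sigma^2 \begin{pmatrix} 1/v_1 + a I_\xi^{-1} a^T & a I_\xi^{-1} a^T \\ a I_\xi^{-1} a^T & 1/v_2 + a I_\xi^{-1} a^T \end{pmatrix} \right) \]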



and 

/ 9 / \ 2^ 

D at I r> 2,7 I x.'/Vl ~ l"2. 



\ VlV 2 V 

where $\sigma^2$ is the variance of the errors in the linear model.

Remark 3.4 Corollary 3.3 can be generalized to general responses. Suppose
that the response $Y_k$ of a subject to treatment $k$, $k = 1, \ldots, K$,
and its covariate $\xi$ satisfy the linear regression model

\[ E[Y_k \mid \xi] = p_k(\theta_k, \xi) = \theta_k \xi^T, \quad k = 1, \ldots, K. \]

For the observations up to stage $m$, let $\hat\theta_{m,k}$ minimize the
error sum of squares

\[ S_k(\theta_k) = \sum_{j=1}^{m} X_{j,k} (Y_{j,k} - \theta_k \xi_j^T)^2 \quad \text{over } \theta_k \in \Theta_k, \]

$k = 1, \ldots, K$. Here, $\hat\theta_{m,k}$ is the least-squares estimator
(LSE) of $\theta_k$. Then Corollary 3.3 remains true with
$V_k = I_{\xi,k}^{-1} I_{Y,k} I_{\xi,k}^{-1}$,
$I_{\xi,k} = E[\pi_k(\theta, \xi) \xi^T \xi]$
and $I_{Y,k} = E\{\pi_k(\theta, \xi)(Y_k - \theta_k \xi^T)^2 \xi^T \xi\}$
under the condition $E\|Y_k \xi\|^2 < \infty$,
$k = 1, \ldots, K$. This result follows from Theorem 2.1, as condition (2.4)
is satisfied with
$h_k(Y_k, \xi) = (Y_k - \theta_k \xi^T) \xi I_{\xi,k}^{-1}$.
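The sandwich form $V_k = I_{\xi,k}^{-1} I_{Y,k} I_{\xi,k}^{-1}$ suggests a plug-in estimate that replaces each expectation with an average over the $n$ subjects, keeping only those assigned to treatment $k$. The Python sketch below follows that idea; the averaging scheme mirrors Remark 3.2(a) but is an assumption here, as are the function name `lse_sandwich` and the array layout.

```python
import numpy as np

def lse_sandwich(xi, y, arm, k):
    """Least-squares estimate of theta_k and a plug-in sandwich estimate of V_k."""
    xi = np.asarray(xi, dtype=float)
    y = np.asarray(y, dtype=float)
    arm = np.asarray(arm)
    n = xi.shape[0]
    sel = arm == k                                        # X_{m,k} = 1
    Xk, yk = xi[sel], y[sel]
    theta_hat = np.linalg.lstsq(Xk, yk, rcond=None)[0]    # minimizes S_k(theta_k)
    resid = yk - Xk @ theta_hat
    I_xi = Xk.T @ Xk / n                                  # estimates E[pi_k xi^T xi]
    I_y = (Xk * (resid ** 2)[:, None]).T @ Xk / n         # estimates E[pi_k e^2 xi^T xi]
    I_inv = np.linalg.inv(I_xi)
    return theta_hat, I_inv @ I_y @ I_inv
```

When the errors are homoscedastic with variance $\sigma_k^2$, $I_{Y,k} = \sigma_k^2 I_{\xi,k}$ and the sandwich collapses to $\sigma_k^2 I_{\xi,k}^{-1}$, matching Corollary 3.3.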

4. Discussion. 
This paper makes two major contributions. First, a comprehensive
framework of CARA designs is proposed to serve as a paradigm for treat- 
ment allocation procedures in clinical trials when covariates are available. It 
is a very general framework that allows a wide spectrum of applications to 
very general statistical models, including generalized linear models as special 
cases. Second, asymptotic properties are obtained to provide a statistical 
basis for inferences after using a CARA design. 

When covariate information is not being used in the treatment allocation 
scheme, an optimal allocation proportion is usually determined with the as- 
sistance of some optimality criteria. Jennison and Turnbull (2000) described 
a general procedure to search for an optimal allocation. For CARA designs, 
the means to define and obtain an optimal allocation scheme is still unclear. 
For CARA design, we can find optimal allocation for each fixed value of the 
covariate. Theorem 2.2 provides theoretical support for targeting optimal 
allocation by using CARA design for each fixed covariate. 

For response-adaptive designs without covariates, Hu and Rosenberger 
(2003) studied the relationship among the power, the target allocation, and 
the variability of the designs. It is important to study the behavior of the 
power function when a CARA design is used in clinical trials. It is not 
difficult to derive the power function for binary responses with discrete co- 
variates. For the general covariate £, the formulation becomes very different, 
and it is an interesting topic for future research. 

APPENDIX: PROOFS 

The proofs of the theorems are organized as follows. First, we prove the 
theorems for the general CARA design in Section 2. We then derive the 
results in Section 3 by the application of the theorems in Section 2. 

Proof of Theorem 2.1. First, notice that for each $k = 1, \ldots, K$,
$X_{m+1,k} = X_{m+1,k} - E[X_{m+1,k} \mid \mathcal{F}_m] + g_k(\hat\theta_m)$
and then

\[ N_{n,k} = E[X_{1,k} \mid \mathcal{F}_0] + \sum_{m=1}^{n} (X_{m,k} - E[X_{m,k} \mid \mathcal{F}_{m-1}]) + \sum_{m=1}^{n-1} g_k(\hat\theta_m). \tag{A.1} \]

The second term is a martingale. We then show that the third term can be
approximated by another martingale. Write
$\Delta M_{m,k} = X_{m,k} - E[X_{m,k} \mid \mathcal{F}_{m-1}]$,
$\Delta T_{m,k} = X_{m,k} h_k(Y_{m,k}, \xi_m)$, $k = 1, \ldots, K$. Let
$M_n = \sum_{m=1}^{n} \Delta M_m$ and
$T_n = \sum_{m=1}^{n} \Delta T_m$, where
$\Delta M_m = (\Delta M_{m,1}, \ldots, \Delta M_{m,K})$ and
$\Delta T_m = (\Delta T_{m,1}, \ldots, \Delta T_{m,K})$. Here, the symbol
$\Delta$ denotes the differencing operator of a sequence $\{z_n\}$, i.e.,
$\Delta z_n = z_n - z_{n-1}$. Then $\{(M_n, T_n)\}$ is a multidimensional
martingale sequence that satisfies

\[ |\Delta M_{n,k}| \leq 1, \quad \|\Delta T_{n,k}\| \leq \|h_k(Y_{n,k}, \xi_n)\| \tag{A.2} \]

and $E\|h_k(Y_{n,k}, \xi_n)\|^2 < \infty$, $k = 1, \ldots, K$. It follows that

\[ \|M_n\| = O(\sqrt{n}) \quad \text{and} \quad \|T_n\| = O(\sqrt{n}) \quad \text{in } L_2. \tag{A.3} \]

Also, according to the law of the iterated logarithm for martingales, we have

\[ M_n = O(\sqrt{n \log\log n}) \ \text{a.s.} \quad \text{and} \quad T_n = O(\sqrt{n \log\log n}) \ \text{a.s.} \tag{A.4} \]

From (A.4) and (2.4) it follows that

\[ \hat\theta_n - \theta = O\left( \sqrt{\frac{\log\log n}{n}} \right) \quad \text{a.s.} \tag{A.5} \]

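From (A.1), (A.5), (A.3), and Condition A it follows that

\[
\begin{aligned}
N_{n,k} - n v_k &= M_{n,k} + \sum_{m=1}^{n-1} \sum_{j=1}^{K} (\hat\theta_{m,j} - \theta_j) \left( \frac{\partial g_k}{\partial \theta_j} \right)^T + \sum_{m=1}^{n-1} o(\|\hat\theta_m - \theta\|^{1+\delta}) \\
&= M_{n,k} + \sum_{m=1}^{n} \sum_{j=1}^{K} \frac{T_{m,j}(1 + o(1))}{m} \left( \frac{\partial g_k}{\partial \theta_j} \right)^T + o(n^{1/2}) \\
&= M_{n,k} + \sum_{m=1}^{n} \sum_{j=1}^{K} \frac{T_{m,j}}{m} \left( \frac{\partial g_k}{\partial \theta_j} \right)^T + o(n^{1/2}) \quad \text{in probability},
\end{aligned}
\tag{A.6}
\]

that is,

\[ N_n - n v = M_n + \sum_{m=1}^{n} \sum_{j=1}^{K} \frac{T_{m,j}}{m} \left( \frac{\partial g}{\partial \theta_j} \right)^T + o(n^{1/2}) =: G_n + o(n^{1/2}) \quad \text{in probability}, \tag{A.7} \]

where

\[ G_n = M_n + \sum_{m=1}^{n} \sum_{j=1}^{K} \frac{T_{m,j}}{m} \left( \frac{\partial g}{\partial \theta_j} \right)^T = M_n + \sum_{m=1}^{n} \sum_{j=1}^{K} \Delta T_{m,j} \left( \frac{\partial g}{\partial \theta_j} \right)^T \sum_{i=m}^{n} \frac{1}{i}. \]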
The combination of (A.4) and (A.6) yields

\[ N_n - n v = O(\sqrt{n \log\log n}) + \sum_{m=1}^{n} \sum_{j=1}^{K} \frac{O(\sqrt{m \log\log m})}{m} = O(\sqrt{n \log\log n}) \quad \text{a.s.} \]

(2.5) is obvious by noting (A.5) and the continuity of $\pi(\cdot, \xi)$. The
proof of consistency is thus obtained. Next, we consider the asymptotic
normality. Notice that $M_n$, $T_n$, and $G_n$ are all sums of martingale
differences. It is easy to verify that the Lindeberg condition is satisfied
by (A.2). To complete the proof it suffices to derive the variances. First,
the conditional variance-covariance matrices of the martingale difference
$\{\Delta M_n, \Delta T_n\}$ satisfy

\[ E[(\Delta M_n)^T \Delta M_n \mid \mathcal{F}_{n-1}] = \mathrm{diag}(g(\hat\theta_{n-1})) - (g(\hat\theta_{n-1}))^T g(\hat\theta_{n-1}) \to \Sigma_1 \quad \text{in } L_1; \]

\[ E[(\Delta T_{n,k})^T \Delta T_{n,k} \mid \mathcal{F}_{n-1}] = E[\pi_k(x, \xi)(h_k(Y_k, \xi))^T h_k(Y_k, \xi)] \big|_{x = \hat\theta_{n-1}} \to V_k \quad \text{in } L_1; \]

$E[(\Delta M_{n,i})^T \Delta T_{n,j} \mid \mathcal{F}_{n-1}] = 0$ for all
$i, j$ and $E[(\Delta T_{n,i})^T \Delta T_{n,j} \mid \mathcal{F}_{n-1}] = 0$
for all $i \neq j$. It follows that $\mathrm{Var}\{T_n\}/n \to V$ and

\[ \mathrm{Var}\{G_n\} = \sum_{m=1}^{n} \mathrm{Var}\{\Delta M_m\} + \sum_{m=1}^{n} \left( \sum_{i=m}^{n} \frac{1}{i} \right)^2 \sum_{j=1}^{K} \frac{\partial g}{\partial \theta_j} \mathrm{Var}\{\Delta T_{m,j}\} \left( \frac{\partial g}{\partial \theta_j} \right)^T = n(\Sigma_1 + 2\Sigma_2) + o(n) = n\Sigma + o(n). \]

By the central limit theorem for martingales (Hall and Heyde, 1980), it
follows that
$\sqrt{n}(\hat\theta_n - \theta) = n^{-1/2} T_n + o(1) \xrightarrow{D} N(0, V)$
and

\[ \sqrt{n}(N_n/n - v) = n^{-1/2} G_n + o(1) \xrightarrow{D} N(0, \Sigma). \tag{A.8} \]

The proof is now complete. □
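The factor 2 in $\Sigma = \Sigma_1 + 2\Sigma_2$ can be checked numerically. Exchanging the order of summation in $\sum_{m=1}^{n} T_m / m$ gives each increment $\Delta T_m$ the weight $w_m = \sum_{i=m}^{n} 1/i$, and $\frac{1}{n} \sum_{m=1}^{n} w_m^2$ approaches $\int_0^1 (\log(1/t))^2 \, dt = 2$. The short Python sketch below evaluates this average; the function name `weight_factor` is hypothetical.

```python
import numpy as np

def weight_factor(n):
    """Return (1/n) * sum_{m=1}^{n} (sum_{i=m}^{n} 1/i)^2."""
    inv = 1.0 / np.arange(1, n + 1)
    # Reverse cumulative sum: tail[m-1] = sum_{i=m}^{n} 1/i.
    tail = np.cumsum(inv[::-1])[::-1]
    return float(np.mean(tail ** 2))
```

The average stays slightly below 2 for finite $n$ and moves toward 2 as $n$ grows, which is the source of the multiplier on $\Sigma_2$.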

Proof of Theorem 2.2. First, according to the law of large numbers, we
have

\[ \frac{1}{n} \sum_{m=1}^{n} I\{\xi_m = x\} = \frac{N_n(x)}{n} \to P(\xi = x) \quad \text{a.s.} \tag{A.9} \]

and

\[ \frac{1}{n} \sum_{m=1}^{n} X_{m,k} I\{\xi_m = x\} = \frac{1}{n} \sum_{m=1}^{n} \left( X_{m,k} I\{\xi_m = x\} - E[X_{m,k} I\{\xi_m = x\} \mid \mathcal{F}_{m-1}] \right) + \frac{1}{n} \sum_{m=1}^{n} \pi_k(\hat\theta_{m-1}, x) P(\xi_m = x) \to \pi_k(\theta, x) P(\xi = x) \quad \text{a.s.}, \]

and thus (2.8) is proved. We then consider the asymptotic normality. The
proof is similar to that of (A.8). The difference lies in the approximation
of the process by a new $2K$-dimensional martingale and the calculation of
its variance-covariance matrix. Define
$\zeta_{n,k}(x) := \sum_{m=1}^{n} (X_{m,k} - \pi_k(\theta, x)) I\{\xi_m = x\}$.
Then,

\[ \sqrt{N_n(x)} \left( \frac{N_{n|x}}{N_n(x)} - \pi(\theta, x) \right) = \frac{(\zeta_{n,1}(x), \ldots, \zeta_{n,K}(x))}{\sqrt{N_n(x)}}. \]

Notice (A.9). It is sufficient to prove

\[ n^{-1/2} (\zeta_{n,1}(x), \ldots, \zeta_{n,K}(x)) \xrightarrow{D} N(0, \Sigma_{|x} P(\xi = x)). \tag{A.10} \]



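With the same argument as is used to derive (A.6), we can obtain

\[
\begin{aligned}
\zeta_{n,k}(x) &= \sum_{m=1}^{n} \left( \Delta\zeta_{m,k}(x) - E[\Delta\zeta_{m,k}(x) \mid \mathcal{F}_{m-1}] \right) + \sum_{m=1}^{n} \left[ \pi_k(\hat\theta_{m-1}, x) - \pi_k(\theta, x) \right] P(\xi = x) \\
&= \sum_{m=1}^{n} \left( \Delta\zeta_{m,k}(x) - E[\Delta\zeta_{m,k}(x) \mid \mathcal{F}_{m-1}] \right) + \sum_{j=1}^{K} \sum_{m=1}^{n} \frac{T_{m,j}}{m} \left( \frac{\partial \pi_k(\theta, x)}{\partial \theta_j} \right)^T P(\xi = x) + o(n^{1/2}) \quad \text{in probability} \\
&=: G_{n,k}(x) + o(n^{1/2}).
\end{aligned}
\]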



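Similar to the proof of (A.8), to complete the proof it suffices to get the
variance of $G_n(x) = (G_{n,1}(x), \ldots, G_{n,K}(x))$. Let
$\Delta M_{n,k}(x) = \Delta\zeta_{n,k}(x) - E[\Delta\zeta_{n,k}(x) \mid \mathcal{F}_{n-1}]$.
The variance-covariance matrix of the martingale difference
$\{(\Delta M_n(x), \Delta T_n)\}$ then satisfies
$E[(\Delta M_{n,k}(x))^2 \mid \mathcal{F}_{n-1}] \to \pi_k(\theta, x)(1 - \pi_k(\theta, x)) P(\xi = x)$,
$E[\Delta M_{n,k}(x) \Delta M_{n,j}(x) \mid \mathcal{F}_{n-1}] \to -\pi_k(\theta, x) \pi_j(\theta, x) P(\xi = x)$
for all $k \neq j$, and
$E[\Delta M_{n,k}(x) \Delta T_{n,j} \mid \mathcal{F}_{n-1}] = 0$ for all
$k, j$, in $L_1$. It follows that

\[
\begin{aligned}
\mathrm{Var}\{G_n(x)\} &= n \left[ \left( \mathrm{diag}(\pi(\theta, x)) - \pi(\theta, x)^T \pi(\theta, x) \right) P(\xi = x) + o(1) \right] \\
&\quad + \sum_{m=1}^{n} \left( \sum_{i=m}^{n} \frac{1}{i} \right)^2 \sum_{j=1}^{K} \frac{\partial \pi(\theta, x)}{\partial \theta_j} \mathrm{Var}\{\Delta T_{m,j}\} \left( \frac{\partial \pi(\theta, x)}{\partial \theta_j} \right)^T P^2(\xi = x) \\
&= n \Sigma_{|x} P(\xi = x) + o(n).
\end{aligned}
\]

(A.10) is then proved. □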

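Proof of Corollary 3.1. By Theorem 2.1, it suffices to verify the
condition (2.4). Notice that $\hat\theta_{m,k}$ is a solution to
$\partial \log L_k / \partial \theta_k = 0$. The application of Taylor's
theorem yields

\[ \frac{\partial \log L_k}{\partial \theta_k} \Big|_{\theta_k} + (\hat\theta_{m,k} - \theta_k) \left\{ \frac{\partial^2 \log L_k}{\partial \theta_k^2} \Big|_{\theta_k} + \int_0^1 \left[ \frac{\partial^2 \log L_k}{\partial \theta_k^2} \right]_{\theta_k}^{\theta_k + t(\hat\theta_{m,k} - \theta_k)} dt \right\} = 0, \tag{A.11} \]

where $[f(x)]_a^b = f(b) - f(a)$. Notice that

\[ \frac{\partial \log L_k}{\partial \theta_k} = \sum_{j=1}^{m} X_{j,k} \frac{\partial \log f_k(Y_{j,k} \mid \xi_j, \theta_k)}{\partial \theta_k} \]

and

\[ \frac{\partial^2 \log L_k}{\partial \theta_k^2} = \sum_{j=1}^{m} X_{j,k} \frac{\partial^2 \log f_k(Y_{j,k} \mid \xi_j, \theta_k)}{\partial \theta_k^2}. \tag{A.12} \]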

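We assume the following regularity condition:

\[ H(\delta) =: E\left[ \sup_{\|z\| \leq \delta} \left\| \left[ \frac{\partial^2 \log f_k(Y_k \mid \xi, \theta_k)}{\partial \theta_k^2} \right]_{\theta_k}^{\theta_k + z} \right\| \right] \to 0 \quad \text{as } \delta \to 0. \tag{A.13} \]

This regularity condition is implied by the simple condition that $a_k''$,
$h_k''$ are continuous and $\xi$ is bounded. Under (A.13), one can show that

\[ \sup_{\|z\| \leq \delta} \left\| \frac{1}{m} \left[ \frac{\partial^2 \log L_k}{\partial \theta_k^2} \right]_{\theta_k}^{\theta_k + z} \right\| \leq H(\delta) + o(1) \quad \text{a.s.} \]

However,

\[ \sum_{j=1}^{m} \left\{ X_{j,k} \frac{\partial^2 \log f_k(Y_{j,k} \mid \xi_j, \theta_k)}{\partial \theta_k^2} - E\left[ X_{j,k} \frac{\partial^2 \log f_k(Y_{j,k} \mid \xi_j, \theta_k)}{\partial \theta_k^2} \,\Big|\, \mathcal{F}_{j-1} \right] \right\} \]

is a martingale. According to the law of large numbers,

\[
\begin{aligned}
\frac{\partial^2 \log L_k}{\partial \theta_k^2} &= \sum_{j=1}^{m} E\left[ X_{j,k} \frac{\partial^2 \log f_k(Y_{j,k} \mid \xi_j, \theta_k)}{\partial \theta_k^2} \,\Big|\, \mathcal{F}_{j-1} \right] + o(m) \\
&= -\sum_{j=1}^{m} \left\{ E[\pi_k(z, \xi) I_k(\theta_k \mid \xi)] \right\}_{z = \hat\theta_{j-1}} + o(m) = -m I_k + o(m) \quad \text{a.s.}
\end{aligned}
\tag{A.14}
\]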
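The substitution of (A.12) and (A.14) into (A.11) yields

$$m(\hat\theta_{m,k} - \theta_k)\bigl\{I_k + o(1) + O\bigl(H(\|\hat\theta_{m,k} - \theta_k\|)\bigr)\bigr\} = \sum_{j=1}^{m} X_{j,k}\,\frac{\partial \log f_k(Y_{j,k} \mid \xi_j, \theta_k)}{\partial\theta_k}.$$

Thus,

$$\hat\theta_{m,k} - \theta_k = \frac{1}{m}\sum_{j=1}^{m} X_{j,k}\,\frac{\partial \log f_k(Y_{j,k} \mid \xi_j, \theta_k)}{\partial\theta_k}\,I_k^{-1}(1 + o(1)) \quad\text{a.s.} \tag{A.15}$$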

Notice that

$$E\Bigl[\frac{\partial \log f_k(Y_k \mid \xi, \theta_k)}{\partial\theta_k}\Bigm| \xi\Bigr] = 0.$$

Hence, condition (2.4) is valid. By Theorem 2.1, the proof is complete. □
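
The martingale law of large numbers invoked for (A.14) can be illustrated numerically. The following Python sketch is only an informal check under assumptions of our own: the allocation rule (a clipped running success proportion), the function name `centered_allocation_average` and the success probability 0.7 are hypothetical and are not one of the designs studied in this paper. It simulates allocation indicators $X_j$ whose conditional probability depends on the past only, and averages the martingale differences $X_j - E[X_j \mid \mathcal{F}_{j-1}]$.

```python
import random


def centered_allocation_average(m, success_prob=0.7, seed=0):
    """Return (1/m) * sum_j (X_j - E[X_j | F_{j-1}]) for a toy adaptive rule.

    Hypothetical rule: the allocation probability at stage j is the running
    success proportion on the treatment, clipped to [0.1, 0.9], so it depends
    only on responses observed before stage j.
    """
    rng = random.Random(seed)
    successes, trials = 1.0, 2.0  # pseudo-counts: first allocation probability is 1/2
    total = 0.0
    for _ in range(m):
        p = min(0.9, max(0.1, successes / trials))  # known given the past
        x = 1 if rng.random() < p else 0            # allocation indicator X_j
        total += x - p                              # martingale difference
        if x == 1:                                  # a response is seen only if allocated
            trials += 1.0
            if rng.random() < success_prob:
                successes += 1.0
    return total / m


avg_small = centered_allocation_average(1_000)
avg_large = centered_allocation_average(100_000)
```

Since each difference is bounded by one in absolute value, the centered average has standard deviation at most $1/(2\sqrt{m})$, so `avg_large` should be much closer to zero than a fixed tolerance such as 0.02.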
APPENDIX B: ADDITIONAL PROOFS



Proof of the existence and consistency of the solution of (A.11). It suffices to show that, for any $\delta > 0$ small enough, with probability one for $m$ large enough we have

$$\log L_k(\theta_k^*) < \log L_k(\theta_k), \quad\text{if } \|\theta_k^* - \theta_k\| = \delta. \tag{B.1}$$
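The application of Taylor's theorem yields

$$\begin{aligned}
\frac{1}{m}\log L_k(\theta_k^*) - \frac{1}{m}\log L_k(\theta_k)
&= \frac{1}{m}\,\frac{\partial \log L_k}{\partial\theta_k}\Big|_{\theta_k}(\theta_k^* - \theta_k)^T
+ \frac{1}{2}(\theta_k^* - \theta_k)\,\frac{1}{m}\,\frac{\partial^2 \log L_k}{\partial\theta_k^2}\Big|_{\theta_k}(\theta_k^* - \theta_k)^T \\
&\quad + (\theta_k^* - \theta_k)\Bigl\{\int_0^1 (1 - t)\Bigl[\frac{1}{m}\,\frac{\partial^2 \log L_k}{\partial\theta_k^2}\Big|_{\theta_k}^{\theta_k + t(\theta_k^* - \theta_k)}\Bigr]\,dt\Bigr\}(\theta_k^* - \theta_k)^T.
\end{aligned}$$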



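So with probability one for $m$ large enough,

$$\begin{aligned}
&\frac{1}{m}\log L_k(\theta_k^*) - \frac{1}{m}\log L_k(\theta_k) \\
&\quad \le -\frac{1}{2}(\theta_k^* - \theta_k)\Bigl\{\frac{1}{m}\sum_{j=1}^{m}\bigl\{E[\pi_k(z,\xi)\,I_k(\theta_k \mid \xi)]\bigr\}_{z = \hat\theta_{j-1}}\Bigr\}(\theta_k^* - \theta_k)^T
+ \|\theta_k^* - \theta_k\|^2 H(\|\theta_k^* - \theta_k\|) + o(1) \\
&\quad \le -\frac{1}{2}\|\theta_k^* - \theta_k\|^2 \min_{\|y\| = 1,\, z}\bigl\{y\,E[\pi_k(z,\xi)\,I_k(\theta_k \mid \xi)]\,y^T\bigr\}
+ \|\theta_k^* - \theta_k\|^2 H(\|\theta_k^* - \theta_k\|) + o(1) \\
&\quad \le -c\,\delta^2 + \delta^2 H(\delta) + o(1) < 0
\end{aligned}$$

uniformly in $\theta_k^*$ with $\|\theta_k^* - \theta_k\| = \delta$ when $\delta$ is small enough. (B.1) is proved. □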

Proof of Corollary 3.2. Notice that

$$\frac{\partial^2 \log f_k(Y_k \mid \xi, \theta_k)}{\partial\theta_k^2} = -p_k q_k\,\xi^T\xi$$

is bounded by $\|\xi\|^2$ and is continuous in $\theta_k$. It follows that the regularity condition (A.13) is satisfied due to the dominated convergence theorem. On the other hand, it is obvious that

$$\frac{\partial^2 \log L_k}{\partial\theta_k^2} = -\sum_{j=1}^{m} X_{j,k}\,p_k(\theta_k,\xi_j)\,q_k(\theta_k,\xi_j)\,\xi_j^T\xi_j$$

is a negative definite matrix, and so $\log L_k(\theta_k)$ is a strictly concave function of $\theta_k$. It follows that the MLE is unique. Corollary 3.2 now follows from Corollary 3.1. □
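
The concavity argument above can be checked numerically in a simplified setting. The following Python sketch is a minimal illustration under assumptions of our own, not the estimation procedure of this paper: the covariate is scalar with no intercept, the allocation indicators are drawn with a fixed probability 1/2 rather than adaptively, and the names `score_and_hessian` and `newton_mle` are hypothetical. It computes the score and second derivative of the logistic log-likelihood restricted to allocated patients and solves the likelihood equation by Newton's method.

```python
import math
import random


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def score_and_hessian(theta, alloc, xs, ys):
    """Score and second derivative of log L_k for a scalar parameter theta."""
    score, hess = 0.0, 0.0
    for x_jk, xi, y in zip(alloc, xs, ys):
        p = sigmoid(theta * xi)
        score += x_jk * (y - p) * xi
        hess -= x_jk * p * (1.0 - p) * xi * xi  # each term is <= 0
    return score, hess


def newton_mle(alloc, xs, ys, theta0=0.0, tol=1e-10, max_iter=50):
    """Solve the likelihood equation score(theta) = 0 by Newton's method."""
    theta = theta0
    for _ in range(max_iter):
        score, hess = score_and_hessian(theta, alloc, xs, ys)
        step = score / hess
        theta -= step
        if abs(step) < tol:
            break
    return theta


rng = random.Random(1)
theta_true = 0.8  # hypothetical true parameter
m = 4000
xs = [rng.uniform(-2.0, 2.0) for _ in range(m)]
alloc = [1 if rng.random() < 0.5 else 0 for _ in range(m)]
ys = [1 if rng.random() < sigmoid(theta_true * xi) else 0 for xi in xs]
theta_hat = newton_mle(alloc, xs, ys)
final_score, final_hess = score_and_hessian(theta_hat, alloc, xs, ys)
```

Because the second derivative is strictly negative whenever some allocated patient has a nonzero covariate, the root found by Newton's method is the unique maximizer in this toy example.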

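Proof of Remark 3.4. It is obvious that $S_k(\theta_k)$ is a strictly convex function of $\theta_k$. It follows that the LSE $\hat\theta_{m,k}$ exists and is unique. On the other hand, it is easily seen that $\hat\theta_{m,k}$ is the solution of the normal equation

$$(\hat\theta_{m,k} - \theta_k)\Bigl[\frac{1}{m}\sum_{j=1}^{m} X_{j,k}\,\xi_j^T\xi_j\Bigr] = \frac{1}{m}\sum_{j=1}^{m} X_{j,k}(Y_{j,k} - \theta_k\xi_j^T)\,\xi_j. \tag{B.2}$$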
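Also, $\{X_{j,k}\,\xi_j^T\xi_j - E[X_{j,k}\,\xi_j^T\xi_j \mid \mathcal{F}_{j-1}]\}$ and $\{X_{j,k}(Y_{j,k} - \theta_k\xi_j^T)\,\xi_j\}$ are both sequences of martingale differences. It follows from the law of large numbers for martingales that

$$\begin{aligned}
\frac{1}{m}\sum_{j=1}^{m} X_{j,k}\,\xi_j^T\xi_j
&= \frac{1}{m}\sum_{j=1}^{m} E[X_{j,k}\,\xi_j^T\xi_j \mid \mathcal{F}_{j-1}] + o(1) \\
&= \frac{1}{m}\sum_{j=1}^{m}\bigl\{E[\pi_k(z,\xi)\,\xi^T\xi]\bigr\}_{z = \hat\theta_{j-1}} + o(1) \quad\text{a.s.}
\end{aligned}$$

and

$$\frac{1}{m}\sum_{j=1}^{m} X_{j,k}(Y_{j,k} - \theta_k\xi_j^T)\,\xi_j \to 0 \quad\text{a.s.} \tag{B.3}$$

It follows that

$$\begin{aligned}
\Bigl[\min_{\|y\| = 1,\, z} y\,\bigl(E[\pi_k(z,\xi)\,\xi^T\xi]\bigr)\,y^T - o(1)\Bigr]\|\hat\theta_{m,k} - \theta_k\|^2
&\le (\hat\theta_{m,k} - \theta_k)\Bigl[\frac{1}{m}\sum_{j=1}^{m} X_{j,k}\,\xi_j^T\xi_j\Bigr](\hat\theta_{m,k} - \theta_k)^T \\
&= \frac{1}{m}\sum_{j=1}^{m} X_{j,k}(Y_{j,k} - \theta_k\xi_j^T)\,\xi_j\,(\hat\theta_{m,k} - \theta_k)^T
= o(1)\,\|\hat\theta_{m,k} - \theta_k\| \quad\text{a.s.}
\end{aligned}$$

Hence

$$\hat\theta_{m,k} \to \theta_k \quad\text{a.s.}$$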

Now, by l|B^ and {eOJ, 

1 m 

771 3=1 

which, together with (B.2), implies that

$$\hat\theta_{m,k} - \theta_k = \frac{1}{m}\sum_{j=1}^{m} X_{j,k}(Y_{j,k} - \theta_k\xi_j^T)\,\xi_j\,I_k^{-1}(1 + o(1)) \quad\text{a.s.} \tag{B.4}$$



Notice $E[(Y_{j,k} - \theta_k\xi_j^T)\,\xi_j \mid \xi_j] = 0$. Condition (2.4) is satisfied with $h_k = (Y_k - \theta_k\xi^T)\,\xi\,I_k^{-1}$.
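
The normal equation (B.2) is an exact algebraic identity for the least-squares estimator, which can be confirmed numerically. The following Python sketch assumes, for illustration only, a scalar covariate, standard normal errors, a hypothetical true slope `theta_k = 1.5`, and allocation indicators drawn with fixed probability 1/2 rather than by a response-adaptive rule.

```python
import random

rng = random.Random(2)
theta_k = 1.5  # hypothetical true slope
m = 5000
xi = [rng.uniform(-1.0, 1.0) for _ in range(m)]             # scalar covariates
alloc = [1 if rng.random() < 0.5 else 0 for _ in range(m)]  # indicators X_{j,k}
y = [theta_k * x + rng.gauss(0.0, 1.0) for x in xi]         # responses

# Least-squares estimate using only the patients allocated to treatment k.
sum_xx = sum(a * x * x for a, x in zip(alloc, xi))
sum_xy = sum(a * x * yy for a, x, yy in zip(alloc, xi, y))
theta_hat = sum_xy / sum_xx

# Both sides of the scalar version of the normal equation (B.2).
lhs = (theta_hat - theta_k) * (sum_xx / m)
rhs = sum(a * (yy - theta_k * x) * x for a, x, yy in zip(alloc, xi, y)) / m
```

The two sides agree up to floating-point rounding, and `theta_hat` is close to `theta_k` for this sample size, in line with the consistency shown above.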
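Proof of Remark 3.3. Now, the model is

$$Y_j = X_{j,1}\mu_1 + X_{j,2}\mu_2 + \beta\xi_j^T + \epsilon_j.$$

Let

$$S(\mu_1^*, \mu_2^*, \beta^*) = \sum_{j=1}^{m}\bigl(Y_j - X_{j,1}\mu_1^* - X_{j,2}\mu_2^* - \beta^*\xi_j^T\bigr)^2.$$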



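Write $\eta_j = (X_{j,1}, X_{j,2}, \xi_j)$ and $\theta = (\mu_1, \mu_2, \beta)$. With the same argument as in the proof of Corollary 3.4, we have that the LSE $\hat\theta_m$ exists and is unique and, further, satisfies the equation

$$(\hat\theta_m - \theta)\,\frac{1}{m}\sum_{j=1}^{m}\eta_j^T\eta_j = \frac{1}{m}\sum_{j=1}^{m}\epsilon_j\,\eta_j.$$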
Write



1(9) 



Tri(i^) 




a 7r 2 ( 



iri(n^)a\ 



Notice the assignment probability at stage j depends only on the estimator 0j-\ and 



does depend on the covariate £j. It follows that v\ — 7Ti( M1 T M2 ), «2 = 7T2( — — —' 



and 

So, similar to (B.4),
Further 

where $I = I(\theta)$ and



- m 1 m 



3=1 



3=1 



a.s. 



1 m 

L-^-y^rVl+otl)) a.s., 



3=1 



aJ^a 1 -aiT lN 
air 1 ^ l/v 2 +a/r 1 a T -a/f 1 



-1„T 



-I~ l a 



-I~ L a 



It follows that 



1 m 

- 9) = -= V + o P (l) % N(0, 1- 1 ). 

3 = 1 

From (B.5), it follows that

/ 1 m 1 m \ 

- Mm2 = V £jX,M V £jXj,2 (1 + o(l)) a.. 

\ nt)i ^— ' nv2 z — ' I 

\ 3=1 3=1 / 



The remainder of the proof is similar to that of Theorem 2.1.



(B.5) 